


Planning in entropy-regularized Markov decision processes and games

Neural Information Processing Systems

We propose SmoothCruiser, a new planning algorithm for estimating the value function in entropy-regularized Markov decision processes and two-player games, given a generative model of the Markov decision process or game. SmoothCruiser makes use of the smoothness of the Bellman operator promoted by the regularization to achieve problem-independent sample complexity of order $\tilde{\mathcal{O}}(1/\epsilon^4)$ for a desired accuracy $\epsilon$, whereas for non-regularized settings there are no known algorithms with guaranteed polynomial sample complexity in the worst case.
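To illustrate the smoothness the abstract refers to: entropy regularization replaces the hard max over action values in the Bellman backup with a log-sum-exp, which is a smooth (Lipschitz-gradient) function of the Q-values. The sketch below is an illustrative example of that standard soft backup, not code from the paper; the temperature parameter `lam` is an assumption of this example.

```python
import math

def soft_backup(q_values, lam):
    """Entropy-regularized (soft) Bellman backup: replaces the hard
    max over action values with a smooth log-sum-exp.
    lam > 0 is the regularization temperature; as lam -> 0 the soft
    backup approaches max(q_values). Illustrative sketch only."""
    m = max(q_values)  # shift by the max for numerical stability
    return m + lam * math.log(sum(math.exp((q - m) / lam) for q in q_values))
```

For example, two equal action values of 0 at temperature 1 back up to log 2, while at a very small temperature the backup is essentially the hard max.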



Reviews: Planning in entropy-regularized Markov decision processes and games

Neural Information Processing Systems

This theoretical paper considers the problem of computing the optimal value function in entropy-regularized MDPs and two-player games. It shows that the smoothness property of the Bellman operator in the presence of entropy-regularized policies (and possibly other forms of regularization) can be used to derive a sample complexity that is polynomial, of order O((1/ε)^{4+c}), with c a problem-independent constant and ε the precision of the value function estimate. The proof is built upon the proposed algorithm, SmoothCruiser, an algorithm inspired by the sparse sampling algorithm of Kearns et al. that recursively estimates V through samples and subsequently aggregates the results. This sampled dynamic programming is carried out up to a depth at which the required number of samples is no longer polynomial. The paper is very well written and provides a solid result.
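The sparse-sampling recursion the reviewer describes can be sketched as follows. This is an illustrative Kearns-style estimator combined with the smooth backup, not the paper's actual SmoothCruiser pseudocode; the `model(state, action) -> (reward, next_state)` generative-model interface, the fixed depth cutoff, and all parameter names are assumptions of this sketch.

```python
import math

def estimate_v(state, depth, model, actions, lam, gamma, n_samples):
    """Sparse-sampling estimate of the entropy-regularized value
    function, in the spirit of Kearns et al.; illustrative sketch,
    not the paper's SmoothCruiser algorithm.
    model(state, action) -> (reward, next_state) is the assumed
    generative-model interface."""
    if depth == 0:
        return 0.0  # truncate the recursion at a fixed horizon
    q = []
    for a in actions:
        total = 0.0
        for _ in range(n_samples):  # Monte Carlo over sampled next states
            r, s_next = model(state, a)
            total += r + gamma * estimate_v(s_next, depth - 1, model,
                                            actions, lam, gamma, n_samples)
        q.append(total / n_samples)
    # smooth (log-sum-exp) backup induced by the entropy regularization
    m = max(q)
    return m + lam * math.log(sum(math.exp((x - m) / lam) for x in q))
```

Note the cost grows exponentially with depth (|actions| × n_samples branches per level), which is why the recursion is only run to a bounded depth, as the review points out.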


Reviews: Planning in entropy-regularized Markov decision processes and games

Neural Information Processing Systems

The reviewers were in consensus that this is an interesting and well written paper with a significant theoretical contribution. While empirical results should not be strictly required for a paper that is strong theoretically, they would nonetheless greatly improve the paper, and thus the authors are strongly encouraged to include them in the final version, even if they are relegated to supplementary material.



Planning in entropy-regularized Markov decision processes and games

Grill, Jean-Bastien, Domingues, Omar Darwiche, Menard, Pierre, Munos, Remi, Valko, Michal

Neural Information Processing Systems
